This repository was archived by the owner on May 15, 2025. It is now read-only.

feat: Add scripts for kubernetes dev env using vLLM and vLLM-p2p #60

Merged
kfirtoledo merged 5 commits into neuralmagic:dev from kfirtoledo:dev
Apr 29, 2025

Conversation

@kfirtoledo

Add support for a Kubernetes development environment using GIE with KGateway and vLLM
This PR introduces support for the vllm mode, enabling integration testing of GIE with vLLM.
It also adds support for the vllm-p2p mode, which includes:

  1. Deployment of Redis and LMCache alongside the vLLM image
  2. Peer-to-peer (P2P) communication between vLLM instances
  3. Use of the EPP image to enable kv-cache-aware routing
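As a rough illustration of how a dev-environment script might choose between the two modes, here is a minimal sketch. The function name `component_dir_for_mode` and the idea of passing the mode as an argument are hypothetical and are not taken from `scripts/kubernetes-dev-env.sh`; only the two directory paths come from the files touched by this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this PR): map a mode name to the
# kustomize component directory that appears in this PR's changed files.
component_dir_for_mode() {
  case "$1" in
    vllm)     echo "deploy/components/vllm" ;;      # plain vLLM stack
    vllm-p2p) echo "deploy/components/vllm-p2p" ;;  # vLLM with Redis and LMCache
    *)        echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

A caller could then pass the result to a tool such as `kubectl kustomize "$(component_dir_for_mode vllm-p2p)"`; whether these directories render on their own depends on how the kustomizations in the PR are written.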

@kfirtoledo added the "help wanted" and "WIP" labels Apr 25, 2025
@kfirtoledo changed the title from "feat: add scripts for kubernetes dev env using vLLM and vLLM-p2p" to "feat: Add scripts for kubernetes dev env using vLLM and vLLM-p2p" Apr 25, 2025
Member

@shaneutt shaneutt left a comment

Looking great. Most of my comments are smaller, but I do have some questions for other folks as to what effect this will have.

Also, cc @elevran @shmuelk who I think should take a look.

Comment thread deploy/components/inference-gateway/inference-models.yaml Outdated
Comment thread deploy/components/vllm-p2p/deployments/redis-deployment.yaml Outdated
Comment thread deploy/components/vllm-p2p/deployments/vllm-deployment.yaml Outdated
Comment thread deploy/components/vllm/deployments.yaml
Comment thread deploy/components/vllm/kustomization.yaml
Comment thread deploy/components/vllm-p2p/kustomization.yaml
Comment thread deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml Outdated
Comment thread deploy/environments/dev/kubernetes-kgateway/kustomization.yaml
@@ -0,0 +1,11 @@
apiVersion: kustomize.config.k8s.io/v1beta1
Member

Oh ok, I see what you're doing with the naming now. The difference now is that any one of these deployments is deploying only a working VLLM stack, and then you have to deploy your inference-gateway stack separately.

cc @tumido @Gregory-Pereira @vMaroon just wanting to check with you on how this will work with your Helm chart?

Comment thread scripts/kubernetes-dev-env.sh Outdated
@shaneutt shaneutt requested a review from shmuelk April 25, 2025 16:15
Comment thread deploy/components/vllm-p2p/deployments/redis-deployment.yaml Outdated
Comment thread deploy/components/vllm-p2p/deployments/vllm-deployment.yaml
@kfirtoledo
Author

@shaneutt, PTAL.

@kfirtoledo removed the "help wanted" and "WIP" labels Apr 27, 2025
Comment thread DEVELOPMENT.md Outdated
Comment thread DEVELOPMENT.md Outdated
Comment thread DEVELOPMENT.md Outdated
Comment thread DEVELOPMENT.md
Comment thread DEVELOPMENT.md
Comment thread deploy/components/vllm/kustomization.yaml Outdated
Comment thread deploy/components/vllm/kustomization.yaml Outdated
Comment thread deploy/environments/dev/kubernetes-kgateway/gateway-parameters.yaml Outdated
Comment thread deploy/environments/dev/kubernetes-kgateway/kustomization.yaml Outdated
Comment thread scripts/kubernetes-dev-env.sh Outdated
@kfirtoledo
Author

@shaneutt and @elevran PTAL

@shaneutt shaneutt self-requested a review April 28, 2025 12:50
Comment thread DEVELOPMENT.md Outdated
Comment thread deploy/components/vllm/deployments.yaml
Comment thread deploy/environments/dev/kubernetes-vllm/vllm/kustomization.yaml Outdated
Member

@shaneutt shaneutt left a comment

Approving to unblock.

Once @elevran is 👍, I'm 👍

@elevran
Collaborator

elevran commented Apr 29, 2025

@kfirtoledo LGTM, any idea on the CICD failure?

…tup for kvcache-aware)

Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
Signed-off-by: Kfir Toledo <kfir.toledo@ibm.com>
@kfirtoledo kfirtoledo merged commit f67cc34 into neuralmagic:dev Apr 29, 2025
1 check failed

Development

Successfully merging this pull request may close these issues.

OpenShift Dev Environment - Full Gateway+GIE Stack Deployment with VLLM and VLLM-P2P mode

5 participants